{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Example Notebook for RAG (Retrieval-Augmented Generation) Agent Usage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Query the RAG agent using the cell magic `%%ask` command"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "# %load_ext msticpy.aiagents.mp_docs_rag_magic\n",
    "# Or use:\n",
    "%reload_ext msticpy.aiagents.mp_docs_rag_magic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-07-30 15:48:19,414 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - \u001b[32mUse the existing collection `MSTICpy_Docs_2.12.0`.\u001b[0m\n",
      "2024-07-30 15:48:27,518 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Found 384 chunks.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: What are the three things that I need to connect to Microsoft Sentinel Query Provider?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: To connect to the Microsoft Sentinel Query Provider, you need the following three things:\n",
       "\n",
       "1. A `QueryProvider` instance.\n",
       "2. The data environment string (\"MSSentinel\" for Microsoft Sentinel).\n",
       "3. A connection string or authentication parameters.\n",
       "\n",
       "Sources: C:\\Users\\t-egarcia\\Documents\\Forked MSTICpy Repo\\msticpy\\docs\\source\\data_acquisition\\DataProviders.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask \n",
    "What are the three things that I need to connect to Microsoft Sentinel Query Provider?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: How do I connect to the M365 Defender query provider?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: To connect to the M365 Defender query provider, you need to follow these steps:\n",
       "\n",
       "1. Ensure your connection details are specified in the `msticpyconfig.yaml` file.\n",
       "\n",
       "2. Create a `QueryProvider` instance for M365 Defender.\n",
       "\n",
       "3. Call the `connect()` method on the instance.\n",
       "\n",
       "Here's an example:\n",
       "\n",
       "```python\n",
       "from msticpy.data import QueryProvider\n",
       "\n",
       "# Create a QueryProvider instance\n",
       "mdatp_prov = QueryProvider(\"M365D\")\n",
       "\n",
       "# Connect to the M365 Defender instance using the configured details\n",
       "mdatp_prov.connect()\n",
       "```\n",
       "\n",
       "If you have multiple instances configured, specify the instance name when calling `connect()`:\n",
       "\n",
       "```python\n",
       "mdatp_prov.connect(instance=\"Tenant2\")\n",
       "```\n",
       "\n",
       "If you prefer to pass connection parameters directly, use keyword arguments:\n",
       "\n",
       "```python\n",
       "# Collect credentials\n",
       "ten_id = input('Tenant ID')\n",
       "client_id = input('Client ID')\n",
       "client_secret = input('Client Secret')\n",
       "\n",
       "# Create a QueryProvider instance\n",
       "mdatp_prov = QueryProvider('M365D')\n",
       "\n",
       "# Connect using collected credentials\n",
       "mdatp_prov.connect(tenant_id=ten_id, client_id=client_id, client_secret=client_secret)\n",
       "```\n",
       "\n",
       "Alternatively, you can use a connection string:\n",
       "\n",
       "```python\n",
       "# Define a connection string\n",
       "conn_str = (\n",
       "    \"tenant_id='243bb6be-4136-4b64-9055-fb661594199a'; \"\n",
       "    \"client_id='a5b24e23-a96a-4472-b729-9e5310c83e20'; \"\n",
       "    \"client_secret='[PLACEHOLDER]'\"\n",
       ")\n",
       "\n",
       "# Create a QueryProvider instance\n",
       "mdatp_prov = QueryProvider('M365D')\n",
       "\n",
       "# Connect using the connection string\n",
       "mdatp_prov.connect(conn_str)\n",
       "```\n",
       "\n",
       "Sources: C:\\Users\\t-egarcia\\Documents\\Forked MSTICpy Repo\\msticpy\\docs\\source\\data_acquisition\\DataProv-MSDefender.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask\n",
    "How do I connect to the M365 Defender query provider?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: What do I need to add to my msticpyconfig.yaml config for the Azure Resource Graph query provider?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: To add Azure Resource Graph to your `msticpyconfig.yaml` configuration, include the following under the `Azure` section:\n",
       "\n",
       "```yaml\n",
       "Azure:\n",
       "  auth_methods:\n",
       "  - cli\n",
       "  - interactive\n",
       "  cloud: global\n",
       "```\n",
       "\n",
       "For more information on configuring `msticpyconfig.yaml`, refer to the MSTICPy documentation.\n",
       "\n",
       "Sources: C:\\Users\\t-egarcia\\Documents\\Forked MSTICpy Repo\\msticpy\\docs\\source\\data_acquisition\\ResourceGraphDriver.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask\n",
    "What do I need to add to my msticpyconfig.yaml config for the Azure Resource Graph query provider?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### A response of `UPDATE_CONTEXT` indicates that the agents are unable to answer the query with the information retrieved by the RAG agent."
   ]
  },
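  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch, assuming the agent's reply is available to your code as a plain string (the `%%ask` magic in this notebook only displays it). The helper `needs_more_context` is hypothetical and not part of msticpy or autogen; it flags a reply that consists only of the `UPDATE CONTEXT` marker, so that you know to rephrase the question or add detail.\n",
    "\n",
    "```python\n",
    "# Hypothetical helper for illustration; not part of msticpy or autogen.\n",
    "UPDATE_CONTEXT_MARKER = \"UPDATE CONTEXT\"\n",
    "\n",
    "\n",
    "def needs_more_context(answer: str) -> bool:\n",
    "    \"\"\"Return True when the reply is just the context-update marker.\"\"\"\n",
    "    # Assumption: tolerate case differences and the underscore spelling.\n",
    "    normalized = answer.strip().upper().replace(\"_\", \" \")\n",
    "    return normalized == UPDATE_CONTEXT_MARKER\n",
    "\n",
    "\n",
    "# Check one marker reply and one ordinary reply.\n",
    "for reply in (\"UPDATE CONTEXT\", \"Some other answer text\"):\n",
    "    print(repr(reply), needs_more_context(reply))\n",
    "```\n",
    "\n",
    "If the helper returns `True`, a more specific question (for example, one that names the provider or feature as the documentation does) may retrieve more useful chunks."
   ]
  },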
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: Does the Splunk query provider support device code authentication?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: UPDATE CONTEXT"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask\n",
    "Does the Splunk query provider support device code authentication?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: How can I plot IP addresses in this dataframe on a map?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: To plot IP addresses in a DataFrame on a map using MSTICpy's FoliumMap, you can use the `mp_plot.folium_map` pandas accessor. Here's an example:\n",
       "\n",
       "```python\n",
       "# Plotting IP addresses using the mp_plot.folium_map accessor\n",
       "geo_loc_df.mp_plot.folium_map(ip_column=\"IPAddress\")\n",
       "```\n",
       "\n",
       "This will display an interactive map with markers based on the IP addresses in the \"IPAddress\" column of your DataFrame.\n",
       "\n",
       "Sources: C:\\\\Users\\\\t-egarcia\\\\Documents\\\\Forked MSTICpy Repo\\\\msticpy\\\\docs\\\\source\\\\visualization\\\\FoliumMap.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask \n",
    "How can I plot IP addresses in this dataframe on a map?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: How do I create a new custom data provider with msticpy?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: To create a new custom data provider with MSTICpy, follow these main steps:\n",
       "\n",
       "1. **Write the driver class:** Derive it from `DriverBase` and implement the methods `__init__`, `connect`, `query`, and optionally `query_with_results`.\n",
       "2. **Customize the driver (optional):** Expose attributes via `QueryProvider`, and implement custom parameter formatting and query parameter substitution if needed.\n",
       "3. **Register the driver:** Update the `DataEnvironment` enum and add an entry to the driver dynamic load table.\n",
       "4. **Add queries:** Create a folder named after your `DataEnvironment` and add your query files there.\n",
       "5. **Add settings definition:** Define settings in a YAML configuration file.\n",
       "6. **Create documentation:** Document the configuration and use of the data provider.\n",
       "7. **Create unit tests:** Add unit tests using mocks to simulate service responses.\n",
       "\n",
       "For detailed guidance on these steps, refer to the provided MSTICpy documentation related to data providers.\n",
       "\n",
       "Sources: WritingDataProviders.rst, PluginFramework.rst, ExtendingMsticpy.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask \n",
    "How do I create a new custom data provider with msticpy?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: How do I list which TI providers are currently enabled?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: ### Step 1: Intent\n",
       "The user's intent is to get help with **question answering**.\n",
       "\n",
       "### Step 2: Answer\n",
       "To list which Threat Intelligence (TI) providers are currently enabled in MSTICpy, you can inspect the configuration typically found in the `msticpyconfig.yaml` file under the `TIProviders` section. This configuration file determines which providers are set up and whether they are marked as primary/secondary.\n",
       "\n",
       "Sources: `C:\\\\Users\\\\t-egarcia\\\\Documents\\\\Forked MSTICpy Repo\\\\msticpy\\\\docs\\\\source\\\\extending\\\\WritingTIAndContextProviders.rst`"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask \n",
    "How do I list which TI providers are currently enabled?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: How do I lookup threat intelligence for multiple IP addresses at once?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: Step 1: User's intent is to generate code for performing threat intelligence lookups for multiple IP addresses at once.\n",
       "\n",
       "Step 2:\n",
       "```python\n",
       "from msticpy.context.ip_utils import ip_whois\n",
       "\n",
       "# List of IP addresses to lookup\n",
       "ip_list = [\"123.1.2.3\", \"124.5.6.7\"]\n",
       "\n",
       "# Performing Whois lookup for multiple IP addresses\n",
       "whois_data = ip_whois(ip_list)\n",
       "print(whois_data)\n",
       "```\n",
       "\n",
       "Sources: C:\\Users\\t-egarcia\\Documents\\Forked MSTICpy Repo\\msticpy\\docs\\source\\data_acquisition\\IPWhois.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask \n",
    "How do I lookup threat intelligence for multiple IP addresses at once?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: How do I use pivot functions?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: To use pivot functions in MSTICpy, you have two primary options: creating persistent pivot function definitions in YAML files or adding ad hoc pivot functions directly in code. Here's a brief overview of both methods:\n",
       "\n",
       "**1. Persistent Pivot Function Definitions**\n",
       "\n",
       "- Define your pivot function properties in a YAML file with a top-level element `pivot_providers`.\n",
       "- Example YAML definition:\n",
       "\n",
       "```yaml\n",
       "pivot_providers:\n",
       "  who_is:\n",
       "    src_module: msticpy.context.ip_utils\n",
       "    src_func_name: get_whois_df\n",
       "    func_new_name: whois\n",
       "    input_type: dataframe\n",
       "    entity_map:\n",
       "      IpAddress: Address\n",
       "    func_df_param_name: data\n",
       "    func_df_col_param_name: ip_column\n",
       "    func_out_column_name: query\n",
       "    func_static_params:\n",
       "      all_columns: True\n",
       "      show_progress: False\n",
       "    func_input_value_arg: ip_address\n",
       "```\n",
       "\n",
       "- Load and register the definition using:\n",
       "\n",
       "```python\n",
       "from msticpy.init.pivot_core.pivot import Pivot\n",
       "Pivot.register_pivot_providers(pivot_reg_path=path_to_your_yaml, namespace=globals(), def_container=\"my_container\", force_container=True)\n",
       "```\n",
       "\n",
       "**2. Ad Hoc Pivot Functions in Code**\n",
       "\n",
       "- Add a function as a pivot using the `add_pivot_function` method:\n",
       "\n",
       "```python\n",
       "def my_func(input: str):\n",
       "    return input.upper()\n",
       "\n",
       "Pivot.add_pivot_function(\n",
       "    func=my_func,\n",
       "    container=\"change_case\",\n",
       "    input_type=\"value\",\n",
       "    entity_map={\"Host\": \"HostName\"},\n",
       "    func_input_value_arg=\"input\",\n",
       "    func_new_name=\"upper_name\",\n",
       ")\n",
       "```\n",
       "\n",
       "- Alternatively, use the `PivotRegistration` class:\n",
       "\n",
       "```python\n",
       "from msticpy.init.pivot_core.pivot_register import PivotRegistration\n",
       "\n",
       "def my_func(input: str):\n",
       "    return input.upper()\n",
       "\n",
       "piv_reg = PivotRegistration(\n",
       "    input_type=\"value\",\n",
       "    entity_map={\"Host\": \"HostName\"},\n",
       "    func_input_value_arg=\"input\",\n",
       "    func_new_name=\"upper_name\"\n",
       ")\n",
       "Pivot.add_pivot_function(my_func, piv_reg, container=\"change_case\")\n",
       "```\n",
       "\n",
       "**Running Pivots in DataFrame Pipelines:**\n",
       "\n",
       "```python\n",
       "(\n",
       "    my_df\n",
       "    .query(\"UserCount > 1\")\n",
       "    .mp_pivot.run(IpAddress.util.whois, column=\"Ioc\")\n",
       "    .drop_duplicates()\n",
       ")\n",
       "```\n",
       "\n",
       "- Use `mp_pivot.run` to integrate pivot functions into DataFrame processing pipelines.\n",
       "- Join input and output DataFrames with the `join` parameter in `mp_pivot.run`.\n",
       "\n",
       "**Debugging Tools:**\n",
       "\n",
       "- `mp_pivot.display` for intermediate results.\n",
       "- `mp_pivot.tee` for creating snapshots.\n",
       "- `mp_pivot.tee_exec` for executing intermediate operations (e.g., plotting).\n",
       "\n",
       "Sources: C:\\\\Users\\\\t-egarcia\\\\Documents\\\\Forked MSTICpy Repo\\\\msticpy\\\\docs\\\\source\\\\extending\\\\PivotFunctions.rst, C:\\\\Users\\\\t-egarcia\\\\Documents\\\\Forked MSTICpy Repo\\\\msticpy\\\\docs\\\\source\\\\data_analysis\\\\PivotFunctions.rst, C:\\\\Users\\\\t-egarcia\\\\Documents\\\\Forked MSTICpy Repo\\\\msticpy\\\\docs\\\\source\\\\api\\\\msticpy.init.pivot.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask \n",
    "How do I use pivot functions?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: Which columns do I need in a dataframe to plot process trees?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: To plot process trees, the required columns in a DataFrame are typically:\n",
       "\n",
       "1. `ParentProcessName`\n",
       "2. `Process`\n",
       "\n",
       "Additional attributes such as `SubjectUserName`, `SubjectDomainName`, `SubjectLogonId`, `NewProcessName`, `CommandLine`, and `TimeGenerated` can be used for more detailed visualization and analysis.\n",
       "\n",
       "Sources: C:\\Users\\t-egarcia\\Documents\\Forked MSTICpy Repo\\msticpy\\docs\\source\\visualization\\NetworkGraph.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask \n",
    "Which columns do I need in a dataframe to plot process trees?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: What kind of visualizations does msticpy support?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: MSTICPy supports various visualizations including interactive timelines, process trees, multi-dimensional Morph Charts, data viewers, matrix plots, network plots, and several others listed under the `msticpy.vis` package.\n",
       "\n",
       "Sources: C:\\Users\\t-egarcia\\Documents\\Forked MSTICpy Repo\\msticpy\\docs\\source\\index.rst, C:\\Users\\t-egarcia\\Documents\\Forked MSTICpy Repo\\msticpy\\docs\\source\\visualization\\MorphCharts.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask \n",
    "What kind of visualizations does msticpy support?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: How do I add a new query for Microsoft 365 Defender to msticpy?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: To add a new query for Microsoft 365 Defender (M365D) to MSTICPy, you should use the `QueryProvider` class. Here's a step-by-step guide on how to achieve it:\n",
       "\n",
       "1. **Initialize the `QueryProvider` for M365D**:\n",
       "   ```python\n",
       "   from msticpy.data import QueryProvider\n",
       "\n",
       "   mdatp_prov = QueryProvider(\"M365D\")\n",
       "   ```\n",
       "\n",
       "2. **Connect to the M365 Defender API**:\n",
       "   ```python\n",
       "   mdatp_prov.connect()\n",
       "   ```\n",
       "\n",
       "3. **Add your new query**:\n",
       "   You can add new queries to the query store of `QueryProvider`. Here’s an example of how to define and add a new query:\n",
       "   ```python\n",
       "   new_query = \"\"\"\n",
       "   DeviceEvents\n",
       "   | where ActionType == \"FileCreated\"\n",
       "   | limit 10\n",
       "   \"\"\"\n",
       "   mdatp_prov.add_query(\"GetRecentFileCreatedEvents\", new_query)\n",
       "   ```\n",
       "\n",
       "4. **Run the newly added query**:\n",
       "   ```python\n",
       "   results = mdatp_prov.exec_query(\"GetRecentFileCreatedEvents\")\n",
       "   print(results)\n",
       "   ```\n",
       "\n",
       "In summary, you need to instantiate a `QueryProvider` object for M365D, connect to the API, add the new query, and then execute the query.\n",
       "\n",
       "Sources: C:\\Users\\t-egarcia\\Documents\\Forked MSTICpy Repo\\msticpy\\docs\\source\\data_acquisition\\DataProv-MSDefender.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask \n",
    "How do I add a new query for Microsoft 365 Defender to msticpy?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Question**: Which msticpy module contains the code related to visualizing network graphs?"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Answer**: The MSTICpy module that contains the code related to visualizing network graphs is `msticpy.vis.network_plot`.\n",
       "\n",
       "Sources: C:\\\\Users\\\\t-egarcia\\\\Documents\\\\Forked MSTICpy Repo\\\\msticpy\\\\docs\\\\source\\\\api\\\\msticpy.vis.network_plot.rst"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%ask\n",
    "Which msticpy module contains the code related to visualizing network graphs?"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "internshipenv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
